The ascent of the AI therapist

MIT Technology Review

Four new books grapple with a global mental-health crisis and the dawn of algorithmic therapy. A technician adjusts the wiring inside the Mark I Perceptron. This early AI system was designed not by a mathematician but by a psychologist. More than a billion people worldwide suffer from a mental-health condition, according to the World Health Organization. The prevalence of anxiety and depression is growing in many demographics, particularly young people, and suicide is claiming hundreds of thousands of lives globally each year. Given the clear demand for accessible and affordable mental-health services, it's no wonder that people have looked to artificial intelligence for possible relief.


Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable?

Neural Information Processing Systems

Recently, interpretable machine learning has re-explored concept bottleneck models (CBMs). An advantage of this model class is the user's ability to intervene on predicted concept values, affecting the downstream output. In this work, we introduce a method to perform such concept-based interventions on neural networks that are not interpretable by design, given only a small validation set with concept labels. Furthermore, we formalise the notion of intervenability as a measure of the effectiveness of concept-based interventions and leverage this definition to fine-tune black boxes. Empirically, we explore the intervenability of black-box classifiers on synthetic tabular and natural image benchmarks. We focus on backbone architectures of varying complexity, from simple, fully connected neural nets to Stable Diffusion. We demonstrate that the proposed fine-tuning improves intervention effectiveness and often yields better-calibrated predictions. To showcase the practical utility of our techniques, we apply them to deep chest X-ray classifiers and show that fine-tuned black boxes are more intervenable than CBMs. Lastly, we establish that our methods remain effective under vision-language-model-based concept annotations, alleviating the need for a human-annotated validation set.
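The intervention mechanism described above can be made concrete. A minimal sketch, assuming a network already split into a concept predictor and a downstream label predictor; all weights, dimensions, and names below are illustrative stand-ins, not the paper's actual models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for a trained network split into two stages:
# g maps inputs to concept activations, f maps concepts to class probabilities.
W_g = rng.normal(size=(4, 3))   # input dim 4 -> 3 concepts (hypothetical)
W_f = rng.normal(size=(3, 2))   # 3 concepts -> 2 classes (hypothetical)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_concepts(x):
    return sigmoid(x @ W_g)                 # predicted concept probabilities

def predict_label(c):
    logits = c @ W_f
    return np.exp(logits) / np.exp(logits).sum()   # softmax over classes

def intervene(x, concept_idx, true_value):
    """Replace one predicted concept with a known value, then re-predict."""
    c = predict_concepts(x)
    c_edit = c.copy()
    c_edit[concept_idx] = true_value        # e.g. an expert-provided label
    return predict_label(c), predict_label(c_edit)

x = rng.normal(size=4)
before, after = intervene(x, concept_idx=1, true_value=1.0)
```

Intervenability, informally, measures how much such an edit moves the output toward the correct prediction; fine-tuning for it encourages the network to actually route its decision through the concepts.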


Is this the Right Neighborhood? Accurate and Query Efficient Model Agnostic Explanations

Neural Information Processing Systems

Multiple works try to ascertain explanations for a black-box model's decisions on particular inputs by perturbing the input, or sampling around it, to create a neighborhood, and then fitting a sparse (linear) surrogate model (e.g., LIME) to the model's outputs in that neighborhood.
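The neighborhood-sampling recipe can be sketched as follows; the `black_box` function, noise scale, proximity kernel, and ridge term are illustrative assumptions, not the exact settings of any particular method:

```python
import numpy as np

rng = np.random.default_rng(1)

def black_box(X):
    # Hypothetical opaque model: only feature 0 actually matters.
    return (X[:, 0] > 0).astype(float)

def explain_locally(x, n_samples=500, sigma=0.5):
    """Sample a neighborhood around x, query the model, fit a weighted
    linear surrogate, and return per-feature local importance."""
    Z = x + rng.normal(scale=sigma, size=(n_samples, x.size))
    y = black_box(Z)
    # Weight samples by proximity to x (a simple RBF kernel).
    w = np.exp(-((Z - x) ** 2).sum(axis=1) / (2 * sigma ** 2))
    A = np.hstack([Z, np.ones((n_samples, 1))])       # add intercept column
    Aw = A * w[:, None]
    # Weighted ridge regression via the normal equations.
    coef = np.linalg.solve(Aw.T @ A + 1e-6 * np.eye(A.shape[1]), Aw.T @ y)
    return coef[:-1]                                  # drop the intercept

x0 = np.array([0.2, -1.0, 3.0])
weights = explain_locally(x0)
```

With this toy model, the surrogate's largest coefficient lands on feature 0, which is what a faithful local explanation should report. The question the paper raises is precisely whether such a sampled neighborhood is the "right" one for the explanation to be accurate.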


DOCTOR: A Simple Method for Detecting Misclassification Errors

Neural Information Processing Systems

Deep neural networks (DNNs) have been shown to perform very well on large-scale object recognition problems, leading to widespread use in real-world applications, including situations where DNNs are deployed as "black boxes". A promising approach to securing their use is to accept decisions that are likely to be correct while discarding the others. In this work, we propose DOCTOR, a simple method that aims to identify whether the prediction of a DNN classifier should (or should not) be trusted, so that it can consequently be accepted or rejected. Two scenarios are investigated: Totally Black Box (TBB), where only the soft predictions are available, and Partially Black Box (PBB), where gradient propagation to perform input pre-processing is allowed. Empirically, we show that DOCTOR outperforms all state-of-the-art methods on various well-known image and sentiment-analysis datasets. In particular, we observe a reduction of up to 4% in the false rejection rate (FRR) in the PBB scenario. DOCTOR can be applied to any pre-trained model; it does not require prior information about the underlying dataset and is as simple as the simplest methods available in the literature.
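The accept/reject idea in the totally-black-box setting can be sketched with a simple statistic computed from the soft predictions alone. The statistic below, based on the self-agreement probability g = Σ_c p_c², is in the spirit of DOCTOR's discriminator; the threshold `gamma` is an assumption that would be tuned on validation data:

```python
def reject_prediction(probs, gamma=1.0):
    """Accept or reject a classifier's decision using only its soft predictions.

    g = sum_c p_c^2 is the probability that two independent samples from the
    predicted distribution agree; peaked (confident) predictions give g close
    to 1, so the score (1 - g) / g is small and the prediction is accepted.
    """
    g = sum(p * p for p in probs)
    score = (1.0 - g) / g
    return score <= gamma   # True = trust the prediction

confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.30, 0.25, 0.25, 0.20]
```

Because it needs only the output distribution, such a detector can wrap any pre-trained classifier without retraining it.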


A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences

González-Duque, Miguel

Neural Information Processing Systems

Optimizing discrete black-box functions is key in several domains, e.g., protein engineering and drug design. Due to the lack of gradient information and the need for sample efficiency, Bayesian optimization is an ideal candidate for these tasks. Several methods for high-dimensional continuous and categorical Bayesian optimization have been proposed recently. However, our survey of the field reveals highly heterogeneous experimental set-ups across methods, as well as technical barriers to the replicability and application of published algorithms to real-world tasks. To address these issues, we develop a unified framework to test a vast array of high-dimensional Bayesian optimization methods and a collection of standardized black-box functions representing real-world application domains in chemistry and biology.
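As a rough illustration of the kind of loop such a framework standardizes, here is a minimal Bayesian-optimization sketch over binary sequences. The objective, the kernel on Hamming distance, and the UCB acquisition are all simplifying assumptions for illustration, not components of the actual benchmark:

```python
import numpy as np

rng = np.random.default_rng(2)
L = 8  # sequence length

def black_box(seq):
    # Hypothetical objective: count of ones (stand-in for, e.g., a fitness score).
    return float(np.sum(seq))

def kernel(A, B, ell=2.0):
    # RBF-style kernel on the Hamming distance between binary sequences.
    ham = (A[:, None, :] != B[None, :, :]).sum(-1)
    return np.exp(-ham / ell)

def posterior(X, y, Xc):
    """GP posterior mean and variance at candidate points Xc."""
    K = kernel(X, X) + 1e-6 * np.eye(len(X))
    Ks = kernel(X, Xc)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ y
    var = 1.0 - np.einsum('ij,ji->i', Ks.T @ Kinv, Ks)
    return mu, np.maximum(var, 1e-12)

# Initial design: a few random sequences.
X = rng.integers(0, 2, size=(4, L))
y = np.array([black_box(s) for s in X])

for _ in range(10):
    cand = rng.integers(0, 2, size=(64, L))          # random candidate pool
    mu, var = posterior(X, y, cand)
    ucb = mu + 2.0 * np.sqrt(var)                    # upper-confidence-bound
    pick = cand[np.argmax(ucb)]                      # most promising candidate
    X = np.vstack([X, pick])
    y = np.append(y, black_box(pick))

best = y.max()
```

Real sequence-design tasks differ mainly in the surrogate, the candidate-generation strategy, and the (expensive) objective; standardizing those interfaces is what makes methods comparable.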



Reviewer

Neural Information Processing Systems

We greatly appreciate that both R1 and R2 consider our paper to be well written and clearly presented. On CIFAR-10, the boundary attack achieves a final average MSE of 0.009 at FPR = 0.1, which is much higher than what we obtained with white-box attacks (Table 2); this is the FPR setting we used in our experiments. We also evaluate on the ImageNet dataset, on which few works have experimented or succeeded (including Madry's). We hope that this aspect of our study can be appreciated. While the two properties are indeed known, our observations and analysis go beyond them.


From Black Box to Bijection: Interpreting Machine Learning to Build a Zeta Map Algorithm

Huang, Xiaoyu, Jackson, Blake, Lee, Kyu-Hwan

arXiv.org Artificial Intelligence

There is a large class of problems in algebraic combinatorics which can be distilled into the same challenge: construct an explicit combinatorial bijection. Traditionally, researchers have solved challenges like these by visually inspecting the data for patterns, formulating conjectures, and then proving them. But what is to be done if patterns fail to emerge until the data grows beyond human scale? In this paper, we propose a new workflow for discovering combinatorial bijections via machine learning. As a proof of concept, we train a transformer on paired Dyck paths and use its learned attention patterns to derive a new algorithmic description of the zeta map, which we call the "Scaffolding Map".
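The basic objects here are easy to make concrete: a Dyck path of semilength n is a sequence of n up-steps and n down-steps whose every prefix has at least as many ups as downs. A minimal sketch of representing and enumerating them (the Scaffolding Map itself is the paper's contribution and is not reproduced here):

```python
from itertools import product

def is_dyck(path):
    """A Dyck path encoded as a tuple of 1s (up-steps) and 0s (down-steps):
    balanced overall, never dipping below the starting height."""
    height = 0
    for step in path:
        height += 1 if step == 1 else -1
        if height < 0:
            return False
    return height == 0

def dyck_paths(n):
    """Enumerate all Dyck paths of semilength n by brute force."""
    return [p for p in product((0, 1), repeat=2 * n) if is_dyck(p)]

# The number of Dyck paths of semilength n is the Catalan number C_n.
counts = [len(dyck_paths(n)) for n in range(1, 6)]  # 1, 2, 5, 14, 42
```

Pairs of such sequences, related by the zeta map, are exactly the kind of training data a transformer can be fed to surface the structure of the bijection.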